[cDAC] Stackwalk DacDbi APIs#129091
Conversation
|
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag |
There was a problem hiding this comment.
Pull request overview
This PR expands the cDAC StackWalk and Debugger contracts to support runtime hijack-stub detection and hijacked-context recovery, updates context retrieval to use caller-provided aligned buffers with HRESULT-based error reporting, and wires up corresponding DacDbi stack-walk APIs with updated tests and documentation.
Changes:
- Add Debugger contract support for hijack stub range table (
RgHijackFunction+MemoryRange) and implementIDebugger.IsRuntimeUnwindableStub. - Extend StackWalk contract with hijack-context recovery and seed-context stack-walk support; change
GetContextto fill a caller-provided, 16-byte-aligned buffer and return HRESULTs. - Implement DacDbi stack-walk methods (create/delete/set/get/unwind/check) on top of the cDAC contracts, with test and dump-test updates for alignment and signature changes.
Show a summary per file
| File | Description |
|---|---|
| src/native/managed/cdac/tests/UnitTests/DebuggerTests.cs | Adds unit tests and mock-target plumbing for hijack-stub detection via RgHijackFunction/MemoryRange. |
| src/native/managed/cdac/tests/UnitTests/DacDbiImplTests.cs | Updates tests to match CheckContext signature change (byte* instead of nint). |
| src/native/managed/cdac/tests/TestInfrastructure/TestPlaceholderTarget.cs | Updates TryGetThreadContext override to return HRESULT (int). |
| src/native/managed/cdac/tests/DumpTests/StackWalkDumpTests.cs | Updates dump test to use aligned native memory and validate HRESULT-based GetContext. |
| src/native/managed/cdac/tests/DumpTests/DacDbi/DacDbiStackWalkDumpTests.cs | Updates dump tests to use aligned buffers when calling contract GetContext. |
| src/native/managed/cdac/tests/DataGenerator/TestTarget.cs | Updates TryGetThreadContext override to return HRESULT (int). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader/ContractDescriptorTarget.cs | Changes TryGetThreadContext to return HRESULT from the underlying data-target delegate. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/IDacDbiInterface.cs | Updates COM surface signatures for stack-walk context pointers (byte*) and adds CorDebugSetContextFlags. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/Helpers/StackWalkHandleData.cs | Introduces managed handle state for cDAC-backed DacDbi stack walking. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Legacy/Dbi/DacDbiImpl.cs | Implements DacDbi stack-walk APIs using cDAC StackWalk + Debugger contracts, with optional legacy mirroring/debug validation. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Microsoft.Diagnostics.DataContractReader.Contracts.csproj | Adds System.HResults to enable HRESULT constants in contracts. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/DataType.cs | Adds DataType.MemoryRange for hijack table entries. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/MemoryRange.cs | Adds cDAC MemoryRange data descriptor type (start + size). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Data/Debugger.cs | Adds RgHijackFunction field for hijack function range table pointer. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/StackWalk_1.cs | Adds seed-context CreateStackWalk, hijacked-context recovery, HRESULT-buffer GetContext, and aligned scratch usage for OS context retrieval. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/StackWalk/Context/IPlatformAgnosticContext.cs | Adds ContextAlignment constant (16) for CONTEXT buffer alignment requirements. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Contracts/Debugger_1.cs | Implements IDebugger.IsRuntimeUnwindableStub using hijack range table scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Contracts/Constants.cs | Adds MaxHijackFunctions global name used by Debugger hijack-table scanning. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Target.cs | Changes TryGetThreadContext to return HRESULT instead of bool (API contract shift). |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IStackWalk.cs | Adds seed-context CreateStackWalk overload and changes GetContext to HRESULT + caller buffer; adds RetrieveHijackedContext. |
| src/native/managed/cdac/Microsoft.Diagnostics.DataContractReader.Abstractions/Contracts/IDebugger.cs | Adds IsRuntimeUnwindableStub API to the Debugger contract abstraction. |
| src/coreclr/vm/datadescriptor/datadescriptor.inc | Exposes Debugger RgHijackFunction, introduces MemoryRange type descriptor, and adds MaxHijackFunctions global literal. |
| src/coreclr/inc/memoryrange.h | Adds cDAC exposure (cdac_data<MemoryRange>) for MemoryRange field offsets. |
| src/coreclr/debug/ee/debugger.h | Annotates hijack enum constant dependency; exposes hijack table pointer and constant via cdac_data<Debugger>. |
| src/coreclr/debug/di/rsstackwalk.cpp | Avoids fetching current context after unwind when already at end-of-stack. |
| docs/design/datacontracts/StackWalk.md | Updates contract documentation for new StackWalk APIs and aligned, caller-buffer GetContext. |
| docs/design/datacontracts/Debugger.md | Documents IsRuntimeUnwindableStub and the hijack-table globals/data descriptors. |
Copilot's findings
- Files reviewed: 27/27 changed files
- Comments generated: 6
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| handleData = new StackWalkHandleData(_target.Contracts.StackWalk, threadData); | ||
| SeedHandleFromNativeContext(handleData, pInternalContextBuffer, isFirst: true); | ||
| *ppSFIHandle = handleData.GetHandle(); | ||
| } | ||
| catch (System.Exception ex) | ||
| { | ||
| hr = ex.HResult; | ||
| } |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| | --- | --- | --- | --- | | ||
| | `ExceptionFlags` (`exstatecommon.h`) | `Ex_UnwindHasStarted` | `0x00000004` | Bit flag in `ExceptionInfo.ExceptionFlags` indicating exception unwinding (2nd pass) has started. Used by `IsInStackRegionUnwoundBySpecifiedException` to skip ExInfo trackers still in the 1st pass. | | ||
| | `InlinedCallFrameMarker` (`exceptionhandling.h`) | `ExceptionHandlingHelper` | `2 (64-bit), 1(32-bit)` | Used to determine whether an active call on an InlinedCallFrame is an EH helper. | | ||
| | `Debugger::s_hijackFunction` (`debugger.h`) | `UnhandledExceptionHijackIndex` | `0` | Index of the unhandled-exception hijack entry in the runtime's hijack-function table. Used by `RetrieveHijackedContext` to choose between the SP-based and FP/SP-offset-based CONTEXT recovery paths, and by `IDebugger.IsRuntimeUnwindableStub` to set its `isUnhandledException` out parameter. | |
| | `CLRJitAttachState` | TargetPointer | Pointer to the CLR JIT attach state flags | | ||
| | `CORDebuggerControlFlags` | TargetPointer | Pointer to `g_CORDebuggerControlFlags` | | ||
| | `MetadataUpdatesApplied` | TargetPointer | Pointer to the g_metadataUpdatesApplied flag | | ||
| | `MaxHijackFunctions` | uint32 | Number of entries in the hijack function array. Zero on platforms/configurations where the hijack-function table is not present. | |
| /// <returns>true if successful, false otherwise</returns> | ||
| public abstract bool TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> buffer); | ||
| /// <returns>HRESULT indicating success or failure</returns> | ||
| public abstract int TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> buffer); |
There was a problem hiding this comment.
I'd like to be consistent about the return type across the reading APIs. Is there a reason we need change behavior on E_NOTIMPL vs another failure code?
There was a problem hiding this comment.
Yes. If the failure code is E_NOTIMPL it means we have to try again via the Frames. If the failure code is anything else (for example, E_FAIL from improperly aligned bytes), we should not retry.
There was a problem hiding this comment.
Are there actual cases when GetThreadContext will be E_NOTIMPL? Or can we always fall back on failure? I'm wondering if we can simplify this.
There was a problem hiding this comment.
Yes, I tried to remove the E_NOTIIMPL case and then we realized it was required for non-Windows. #128499
| // Required alignment, in bytes, of any memory buffer passed to Win32 GetThreadContext | ||
| // (or its cross-platform equivalents) as a CONTEXT*. | ||
| public const uint ContextAlignment = 16; |
There was a problem hiding this comment.
I'm not sure the cDAC should care about this. I think the underlying reader should account for this.
There was a problem hiding this comment.
How do you think the reader should account for this? The alignment is either the callers responsibility, or it is the callee’s responsibility which means it would have to allocate aligned bytes and copy. It seems wasteful to copy and allocate just for the alignment, as opposed to the caller alignment we have now by wrapping the CONTEXT structs in declspec(16).
| byte[] contextBuf = new Span<byte>(pContext, (int)contextSize).ToArray(); | ||
| handleData.Reset(contextBuf, isFirst); |
There was a problem hiding this comment.
The underlying calls here don't require a byte[], this is eventually fed into IPlatformAgnosticContext.FillFromBuffer(Span<byte> buffer). Can we change the intermediary calls to accept a Span so we don't need to allocate?
| } | ||
|
|
||
| byte[] IStackWalk.GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags) | ||
| int IStackWalk.GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags, Span<byte> contextBuffer) |
There was a problem hiding this comment.
I don't think we have any other contract layer APIs that use HResults. I'm wondering if we can avoid HResults in the pure managed code, if we can't I think we should create a typed enum to better express the meaning and avoid magic constants.
There was a problem hiding this comment.
On second thought, I don't think we need to deal with the span here. The only caller that uses the span still converts it to an array with SeedHandleFromNativeContext.
There was a problem hiding this comment.
What do you mean by "deal with the span"?
There was a problem hiding this comment.
We could keep this API returning a byte[] and copy into the user provided span inside of the DacDbiImpl.cs
There was a problem hiding this comment.
We could also have it return a caller-allocated Span<> and not have to copy.
…er.Abstractions/Contracts/IStackWalk.cs Co-authored-by: Max Charlamb <44248479+max-charlamb@users.noreply.github.com>
| IPlatformAgnosticContext context = IPlatformAgnosticContext.GetContextForPlatform(_target); | ||
| byte[] bytes = new byte[context.Size]; | ||
| Span<byte> buffer = new Span<byte>(bytes); | ||
| if (contextBuffer.Length < context.Size) | ||
| return unchecked((int)0x80070057); // E_INVALIDARG | ||
| Span<byte> buffer = contextBuffer.Slice(0, (int)context.Size); |
|
|
||
| // Fall back to deriving a context from the explicit Frame chain stored in the Thread object. | ||
| return GetContextFromFrames(threadData); | ||
| else if ((uint)hr == 0x80004001 /* E_NOTIMPL */) | ||
| { | ||
| WriteContextFromFrames(threadData, buffer); | ||
| return 0; | ||
| } | ||
| return hr; |
| IEnumerable<IStackDataFrameHandle> IStackWalk.CreateStackWalk(ThreadData threadData, byte[] contextBuffer, bool isFirst, bool skipFrames) | ||
| { | ||
| IPlatformAgnosticContext context = IPlatformAgnosticContext.GetContextForPlatform(_target); | ||
| context.FillFromBuffer(contextBuffer); | ||
|
|
| // CONTEXT requires 16-byte alignment for the OS GetThreadContext path | ||
| void* scratch = NativeMemory.AlignedAlloc(context.Size, IPlatformAgnosticContext.ContextAlignment); | ||
| try | ||
| { | ||
| Span<byte> buffer = new(scratch, (int)context.Size); |
|
|
||
| public override bool TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> bufferToFill) => throw new NotImplementedException(); | ||
| public override int TryGetThreadContext(ulong threadId, uint contextFlags, Span<byte> bufferToFill) => throw new NotImplementedException(); | ||
|
|
| if (address >= module.baseAddr && address + (uint)buffer.Length < module.baseAddr + module.size) | ||
| { | ||
| ulong moduleOffset = address - module.baseAddr; | ||
| Debug.Assert(cumulativeSize + moduleOffset < (ulong)moduleImageData.Length, "Address is out of bounds"); | ||
| ulong bytesToCopy = ulong.Min((uint)buffer.Length, module.size - moduleOffset); |
| if (address >= header->memoryBase && address + (uint)buffer.Length < header->memoryLimit) | ||
| { | ||
| ulong offset = address - header->memoryBase; | ||
| new Span<byte>((byte*)header + offset, buffer.Length).CopyTo(buffer); | ||
| return buffer.Length; | ||
| return (CdacHResults)buffer.Length; |
| Span<byte> seedSpan = new(pInternalContextBuffer, (int)ctx.Size); | ||
| CdacHResults result = _target.Contracts.StackWalk.GetContext(threadData, ThreadContextSource.Debugger, allFlags, seedSpan); | ||
| if (result.IsFailure()) | ||
| throw Marshal.GetExceptionForHR((int)result)!; |
| Span<byte> destination = new(pContextBuffer, (int)ctx.Size); | ||
| CdacHResults result = _target.Contracts.StackWalk.GetContext(threadData, ThreadContextSource.Debugger, allFlags, destination); | ||
| if (result.IsFailure()) | ||
| throw Marshal.GetExceptionForHR((int)result)!; |
| IPlatformAgnosticContext context = IPlatformAgnosticContext.GetContextForPlatform(_target); | ||
| context.FillFromBuffer(contextBuffer); |
| /// Expected values for ReadFromTargetDelegate: S_OK, E_FAIL | ||
| /// Expected values for WriteToTargetDelegate: S_OK, E_FAIL | ||
| /// Expected values for GetTargetThreadContextDelegate: E_NOTIMPL | ||
| /// Expected values for AllocVirtualDelegate: E_INVALIDARG |
| @@ -84,7 +84,8 @@ public interface IStackWalk : IContract | |||
| bool IsExceptionHandlingHelperInlinedCallFrame(TargetPointer frameAddress) => throw new NotImplementedException(); | |||
| DebuggerEvalData GetDebuggerEvalData(TargetPointer funcEvalFrameAddress) => throw new NotImplementedException(); | |||
| TargetPointer GetRedirectedContextPointer(ThreadData threadData) => throw new NotImplementedException(); | |||
| byte[] GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags) => throw new NotImplementedException(); | |||
| CdacHResults GetContext(ThreadData threadData, ThreadContextSource contextSource, uint contextFlags, Span<byte> contextBuffer) => throw new NotImplementedException(); | |||
| byte[] RetrieveHijackedContext(IStackDataFrameHandle stackDataFrameHandle, bool isUnhandledException) => throw new NotImplementedException(); | |||
|
The HResults looks weird on the stress log analyzer. I’m also not entirely sure how necessary it is to track the HR, that is, if we will likely ever encounter an HResult other than E_NOTIMPL that warrants a different response. I had only noticed it because of the alignment requirements not being met, but if those are met I’m not sure what else there is. |
IsRuntimeUnwindableStubDebugger cDAC APIRetrieveHijackedContextStackWalk cDAC APICreateStackWalkthat accepts an arbitrary seed contextTryGetThreadContextHRESULT for more accurate fallback handlingCreateStackWalk,DeleteStackWalk,SetStackWalkCurrentStackFrame,GetStackWalkCurrentStackFrame, andUnwindStackWalkFrameNote to Copilot code revise: Please DO NOT make ANY comments about breaking public APIs!!!